We report the sequencing of the basidiomycetous yeast Rhodosporidium toruloides CECT1137. The current assembly comprises 
62 scaffolds, for a total size of ca. 20.45 Mbp and a G+C content of ca. 61.9%. The genome annotation predicts 8,206 putative 
protein-coding genes. 
Large-scale production of biodiesel requires considerable
amounts of fatty acids (FAs). The resulting demand for FAs has
made oleaginous yeast species a target of choice as an alternative
source of oils and fats (1). The genomic data available for such
species have increased over the last few years but still remain
scarce. To increase the current knowledge of oleaginous yeast
species, we report the sequencing of the basidiomycetous yeast
Rhodosporidium toruloides CECT1137. This strain was originally
deposited in the National Collection of Yeast Cultures (NCYC)
(United Kingdom) in 1925 by Antoine Guilliermond (strain
NCYC162) under the name "Levure de rose." It was transferred in
1984 to the Spanish Type Culture Collection (CECT) (Spain),
where it received the CECT1137 accession number. This strain
was originally described as Rhodotorula glutinis var. glutinis based
on physiological characteristics. However, our sequencing data
reinstated it as a member of the species R. toruloides.

The CECT1137 genome was sequenced by Eurofins Scientific
(France), using Roche 454 GS FLX Titanium, on a single-read
library and an 8-kb mate-pair library. A complementary run of
Illumina HiSeq 2000 was performed on a cDNA library using
Illumina TruSeq RNA to improve the predictions of the gene
models and their numerous introns. Several assembly runs were
performed using a combination of MIRA (2) and Allpaths-LG (3).
Smoothing and scaffolding were performed using PILON (version
1.7; The Broad Institute [http://www.broadinstitute.org/software/pilon/])
and SSPACE-BASIC-2.0 (4). Genes were predicted using a
combination of tools, including Augustus (5),
GeneMark (6), EST2Genome (7), BLASTp, and PSItblastn (8).
The recently published genome of the closely related strain NP11
was integrated in the annotation pipeline as an additional support
for prediction (9). The cDNA reads were mapped to the genome
using Tophat2 (10). Exon-exon junctions were extracted from the
alignments using a combination of in-house tools developed in
BioPerl (11). Splicing events and structural annotation were
manually validated using the Artemis software (12). The tRNA genes
were identified using tRNAscan-SE (13).
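
The in-house BioPerl tools used for junction extraction are not described here. As a purely illustrative sketch, and not the pipeline used in this work, exon-exon junctions can be recovered from spliced alignments in SAM format by reading the `N` (skipped region) operations of each CIGAR string. The function names `junctions_from_cigar` and `count_junctions` are hypothetical, and the example assumes plain-text SAM records such as those obtained by converting a spliced aligner's output.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions_from_cigar(pos, cigar):
    """Return intron intervals (1-based, inclusive) implied by N operations.

    pos is the 1-based leftmost reference position of the alignment (SAM POS).
    """
    junctions = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # The skipped region spans the intron between two aligned exons.
            junctions.append((ref, ref + length - 1))
        if op in "MDN=X":
            # Only these operations consume reference positions.
            ref += length
    return junctions


def count_junctions(sam_lines):
    """Count reads supporting each (reference, intron_start, intron_end)."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # malformed record
        flag = int(fields[1])
        if flag & 4:
            continue  # unmapped read
        ref_name, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":
            continue  # no alignment information
        for start, end in junctions_from_cigar(pos, cigar):
            counts[(ref_name, start, end)] += 1
    return counts
```

Counting the reads that support each junction, as done here, gives a simple basis for deciding which splicing events deserve manual inspection.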

The current draft comprises 62 scaffolds, for a total size of
20,445,260 bp and a G+C content of ca. 61.9%. Overall, 8,206
putative protein-coding genes have been identified, 212 of which
harbor introns alternatively spliced within the coding sequences
and/or the untranslated regions (UTRs). An additional 145 genes
have been annotated as dubious models or pseudogenes, with
frameshifts, stops in translation, or dubious starts or stops. The
genome contains 149 tRNAs and 249 miscellaneous RNAs. Whenever
possible, a functional annotation was proposed, based on a
combination of BLASTp (8), EMBOSS Needle (7), and InterProScan
(14) against subsets of genes extracted from UniProt
(15), with priority given to experimentally validated data from
yeasts and fungal species. A total of 8,294 proteins were predicted
(including 88 splicing variants), among which 7,080 showed at
least 20% sequence similarity on 70% of an alignment with at least
one gene from the Swiss-Prot/TrEMBL subsets.
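
Summary figures such as the scaffold count, total size, and G+C content can be recomputed from any FASTA file of the assembly. The following is a minimal sketch under stated assumptions, not the procedure used to produce the figures above: the function name `assembly_stats` is hypothetical, and the G+C percentage is computed over unambiguous bases (A, C, G, T) only, which is one common convention among several.

```python
def assembly_stats(fasta_lines):
    """Return scaffold count, total length, and G+C percentage of a FASTA.

    fasta_lines is any iterable of text lines, for example an open file.
    """
    n_scaffolds = 0
    total_bp = 0
    gc = 0
    acgt = 0
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            n_scaffolds += 1  # each header starts a new scaffold
            continue
        seq = line.upper()
        total_bp += len(seq)  # total size includes ambiguous bases such as N
        g_plus_c = seq.count("G") + seq.count("C")
        gc += g_plus_c
        acgt += g_plus_c + seq.count("A") + seq.count("T")
    # Assumption: ambiguous bases are excluded from the G+C denominator.
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {"scaffolds": n_scaffolds, "total_bp": total_bp, "gc_percent": gc_percent}
```

Passing an open file handle, for example `assembly_stats(open("assembly.fasta"))` with a hypothetical file name, streams the sequences line by line, so the whole assembly never has to be held in memory.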

Further comparison of the genome of CECT1137 against other
yeast species will bring additional insights into the genomic
properties of oleaginicity, providing potential targets for future
biotechnological applications.

Nucleotide sequence accession numbers. This whole-genome
shotgun project has been deposited at the European Nucleotide
Archive under accession numbers LK052936 to LK052997.
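
The accession range can be expanded into individual identifiers for batch retrieval. The sketch below is illustrative only; the function name `expand_accession_range` is hypothetical, and it assumes that both ends of the range share the same letter prefix and the same number of digits. The range contains 62 accessions, which would be consistent with one accession per scaffold, although that correspondence is an assumption here.

```python
import re


def expand_accession_range(first, last):
    """List every accession from first to last, inclusive."""
    pattern = r"([A-Z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    prefix, digits_first = m_first.groups()
    prefix_last, digits_last = m_last.groups()
    if prefix != prefix_last or len(digits_first) != len(digits_last):
        raise ValueError("both accessions must share prefix and digit width")
    lo, hi = int(digits_first), int(digits_last)
    if lo > hi:
        raise ValueError("first accession must not follow the last one")
    width = len(digits_first)
    # Zero-pad so that leading zeros in the numeric part are preserved.
    return [f"{prefix}{n:0{width}d}" for n in range(lo, hi + 1)]


accessions = expand_accession_range("LK052936", "LK052997")
print(len(accessions))  # 62
```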